Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

li2_nc reader daskified #2985

Merged
merged 4 commits into from
Nov 20, 2024
Merged

Conversation

ClementLaplace
Copy link
Contributor

@ClementLaplace ClementLaplace commented Nov 20, 2024

This pull request enables the daskification of the non accumulated and non transformed dataset.

Copy link

codecov bot commented Nov 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.08%. Comparing base (fd2cec6) to head (fa56be5).
Report is 14 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2985      +/-   ##
==========================================
- Coverage   96.10%   96.08%   -0.02%     
==========================================
  Files         377      377              
  Lines       55147    55155       +8     
==========================================
  Hits        52997    52997              
- Misses       2150     2158       +8     
Flag Coverage Δ
behaviourtests 3.94% <0.00%> (-0.01%) ⬇️
unittests 96.18% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@coveralls
Copy link

coveralls commented Nov 20, 2024

Pull Request Test Coverage Report for Build 11934839781

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 5 of 5 (100.0%) changed or added relevant lines in 2 files are covered.
  • 8 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.01%) to 96.193%

Files with Coverage Reduction New Missed Lines %
satpy/tests/utils.py 2 93.16%
satpy/tests/reader_tests/gms/test_gms5_vissr_l1b.py 3 98.67%
satpy/tests/reader_tests/gms/test_gms5_vissr_navigation.py 3 97.18%
Totals Coverage Status
Change from base Build 11815057382: -0.01%
Covered Lines: 53241
Relevant Lines: 55348

💛 - Coveralls

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a comment or two inline.

satpy/readers/li_l2_nc.py Outdated Show resolved Hide resolved
satpy/tests/reader_tests/test_li_l2_nc.py Show resolved Hide resolved
@mraspaud mraspaud merged commit 12c05dc into pytroll:main Nov 20, 2024
16 of 18 checks passed
@ameraner
Copy link
Member

So this PR is checking if an array is not daskified and converts it to a dask array if it's not.

I think it would be important to know why the array was not a dask array in the first place, otherwise we are still possibly loading the data into memory maybe unnecessarily. Tracking back the get_dataset in the base class, the variables are taken from the file here

return self[vpath]

could you please check if they are not daskified here already, and if so, why? And otherwise, if they're always daskified coming from the file, we should find out where in the code they are computed and turned into numpy...

@ClementLaplace
Copy link
Contributor Author

ClementLaplace commented Nov 21, 2024

@ameraner here is the result of my investigation.

The code that you showed me bellow

return self[vpath]

returns either non daskified or daskified xarray DataArray.

The reason of the "daskification" or not is to be found there

According to what I have seen all the non daskified array are in fact cached_data which is caused due to the the length of the array* the lenght of dtype_size is inferior to cache_var_size you can see this operation done there

Then some non daskified array are transformed into daskify in this part of the code it is depending of the transformation made.

@ameraner
Copy link
Member

Hi @ClementLaplace , thanks for the analysis! I see, it's quite as @gerritholl was suspecting :) Ok, then I would say it is ok to keep this implementation, in order to make sure that we have homogenised dask array outputs for dataset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reading LI L2 point data is not daskified
4 participants